Introduction to Snorkel by Stephen Bach

Mobilize Center


Snorkel (https://hazyresearch.github.io/snorkel/) is a tool that automatically extracts information from unstructured data sources, such as the scientific literature and clinical notes, without using large, labeled training datasets, which are often lacking in biomedicine. In this workshop, participants learned about the Snorkel workflow through brief lectures and hands-on activities. This included:

- Writing labeling functions using pattern-matching and comparisons against existing dictionaries (e.g., Unified Medical Language System)
- Fitting and assessing a model to the labeling functions to generate the training data
- Hearing about examples of problems that can and cannot be addressed with Snorkel
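To make the labeling-function idea in the description concrete, here is a minimal sketch in plain Python. It does not use the Snorkel API: the function names, the tiny `KNOWN_PAIRS` dictionary (a hypothetical stand-in for a resource such as UMLS), the label values, and the example sentences are all assumptions made for illustration. A simple majority vote is swapped in for the model that Snorkel fits over the labeling functions, so it only hints at how noisy votes become training labels.

```python
import re

# Assumed label convention for this sketch: 1 = relation, -1 = no relation, 0 = abstain.
POSITIVE, NEGATIVE, ABSTAIN = 1, -1, 0

# Hypothetical stand-in for an existing dictionary (e.g., pairs derived from UMLS).
KNOWN_PAIRS = {("aspirin", "gastric bleeding")}


def lf_causes_pattern(chemical, disease, sentence):
    """Pattern matching: vote POSITIVE on '<chemical> ... causes/induces ... <disease>'."""
    pattern = (
        rf"{re.escape(chemical)}\b.*\b(causes?|caused|induces?|induced)\b.*"
        rf"\b{re.escape(disease)}"
    )
    return POSITIVE if re.search(pattern, sentence, flags=re.IGNORECASE) else ABSTAIN


def lf_treats_pattern(chemical, disease, sentence):
    """Pattern matching: vote NEGATIVE when the sentence talks about treatment."""
    found = re.search(r"\b(treats?|treated|treatment)\b", sentence, flags=re.IGNORECASE)
    return NEGATIVE if found else ABSTAIN


def lf_dictionary(chemical, disease, sentence):
    """Dictionary comparison: vote POSITIVE if the pair is already known."""
    return POSITIVE if (chemical.lower(), disease.lower()) in KNOWN_PAIRS else ABSTAIN


LABELING_FUNCTIONS = [lf_causes_pattern, lf_treats_pattern, lf_dictionary]


def majority_vote(votes):
    """Combine noisy votes; a simplification of fitting a model to the labeling functions."""
    total = sum(votes)
    if total > 0:
        return POSITIVE
    if total < 0:
        return NEGATIVE
    return ABSTAIN


def label_candidates(candidates):
    """Apply every labeling function to every candidate and combine the votes."""
    return [
        majority_vote([lf(chem, dis, sent) for lf in LABELING_FUNCTIONS])
        for chem, dis, sent in candidates
    ]


candidates = [
    ("Aspirin", "gastric bleeding", "Aspirin causes gastric bleeding in some patients."),
    ("Aspirin", "headache", "Aspirin is used to treat headache."),
    ("Ibuprofen", "fever", "Ibuprofen and fever were both mentioned."),
]
print(label_candidates(candidates))
```

Candidates on which every labeling function abstains receive no label, which is one reason coverage is usually assessed alongside accuracy.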


These chapters are auto-generated

Intro

Outline

Dark Data Extraction (DDE)

Extraction from the Scientific Literature

The Need for Lightweight Extraction

Example: Chemical-Disease Relation Extraction from Text

The Advent of Representation Learning: Deep learning is achieving state-of-the-art results

Relation Extraction with Machine Learning

Training Data Creation: $$$, Slow, Static

Jupyter Interface

Labeling Functions

Data Programming Pipeline in Snorkel

Does Modeling the Noise Help?

Conclusion


Sep 12, 2017
